[6/n] catalog ray serve env vars #60807

Open
harshit-anyscale wants to merge 2 commits into master from env-vars-cataloging

harshit-anyscale (Contributor) commented Feb 6, 2026

Summary

This PR introduces two improvements to Ray Serve's router configuration:

  1. Document existing env vars: Added documentation for three undocumented queue length probing environment variables:

    • RAY_SERVE_QUEUE_LENGTH_RESPONSE_DEADLINE_S
    • RAY_SERVE_MAX_QUEUE_LENGTH_RESPONSE_DEADLINE_S
    • RAY_SERVE_QUEUE_LENGTH_CACHE_TIMEOUT_S
  2. Migrate backoff env vars to config: Makes three router backoff environment variables configurable via RequestRouterConfig. The env vars are marked as deprecated with warnings guiding users to the new config options.

    Environment variables being migrated:

    • RAY_SERVE_ROUTER_RETRY_INITIAL_BACKOFF_S → request_router_config.initial_backoff_s
    • RAY_SERVE_ROUTER_RETRY_BACKOFF_MULTIPLIER → request_router_config.backoff_multiplier
    • RAY_SERVE_ROUTER_RETRY_MAX_BACKOFF_S → request_router_config.max_backoff_s
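
For reviewers who want a feel for how the three parameters interact, here is a minimal sketch of a capped exponential backoff schedule. It assumes the router waits min(initial * multiplier^attempt, max) before each retry, which may not match the exact retry logic in request_router.py. The function name backoff_schedule and the attempts parameter are hypothetical; the default values are the ones quoted in this PR.

```python
def backoff_schedule(
    initial_backoff_s: float = 0.025,
    backoff_multiplier: float = 2,
    max_backoff_s: float = 0.5,
    attempts: int = 7,
) -> list:
    """Return the wait (in seconds) before each retry.

    Assumption: capped exponential backoff. The real router may
    compute or jitter the delay differently.
    """
    return [
        min(initial_backoff_s * backoff_multiplier**attempt, max_backoff_s)
        for attempt in range(attempts)
    ]


# With the defaults, the delay doubles on each retry until it hits the cap.
schedule = backoff_schedule()
for attempt, delay in enumerate(schedule):
    print(f"retry {attempt}: wait {delay:.3f}s")
```

Under this assumption, raising max_backoff_s only affects later retries, while initial_backoff_s shifts the whole schedule.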

Changes

Documentation (performance.md)

  • Added section "Set timeouts while probing replicas for queue length" documenting the three existing queue length env vars

Config Layer (config.py, serve.proto)

  • Added three new fields to RequestRouterConfig: initial_backoff_s, backoff_multiplier, max_backoff_s
  • Default values fall back to env vars for backwards compatibility
  • Added corresponding protobuf fields for config serialization across the cluster
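
The env-var fallback described above could look roughly like the following. This is a standard-library sketch rather than the actual RequestRouterConfig: the class name RouterBackoffSketch and the helper _env_float are hypothetical, and a dataclass stands in for whatever model class config.py really uses. The field names, env var names, and defaults are taken from this PR.

```python
import os
from dataclasses import dataclass, field


def _env_float(name: str, default: float) -> float:
    """Hypothetical helper: read a float from the environment, else use default."""
    raw = os.environ.get(name)
    return float(raw) if raw is not None else default


@dataclass
class RouterBackoffSketch:
    """Stand-in for the three new config fields (not the real class)."""

    # Each default is evaluated at construction time, so an env var that is
    # set still takes effect unless the caller passes an explicit value.
    initial_backoff_s: float = field(
        default_factory=lambda: _env_float(
            "RAY_SERVE_ROUTER_RETRY_INITIAL_BACKOFF_S", 0.025
        )
    )
    backoff_multiplier: float = field(
        default_factory=lambda: _env_float(
            "RAY_SERVE_ROUTER_RETRY_BACKOFF_MULTIPLIER", 2.0
        )
    )
    max_backoff_s: float = field(
        default_factory=lambda: _env_float(
            "RAY_SERVE_ROUTER_RETRY_MAX_BACKOFF_S", 0.5
        )
    )


# An explicit argument wins over both the env var and the built-in default.
config = RouterBackoffSketch(max_backoff_s=1.0)
print(config)
```

The precedence in this sketch is explicit argument, then env var, then hard-coded default; whether the PR resolves the env var at import time or at construction time is not stated here.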

Router Layer (router.py, request_router.py)

  • AsyncioRouter.update_deployment_config() extracts backoff params from config and stores them
  • Backoff params are passed to RequestRouter.__init__() when the router is created
  • RequestRouter stores params as instance variables and uses them in retry logic

Deprecation Warnings (constants_utils.py, router.py)

  • Added env vars to _fully_deprecated_env_vars dictionary
  • Emit DeprecationWarning when deprecated env vars are set, guiding users to use config instead
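
As an illustration of the deprecation behavior, the check could be sketched like this. The dictionary DEPRECATED_ENV_VARS and the function warn_on_deprecated_env_vars are hypothetical names, not the contents of _fully_deprecated_env_vars or the code in constants_utils.py; only the env var names and their config replacements come from this PR.

```python
import os
import warnings

# Hypothetical mapping from deprecated env var to its config replacement.
DEPRECATED_ENV_VARS = {
    "RAY_SERVE_ROUTER_RETRY_INITIAL_BACKOFF_S": "request_router_config.initial_backoff_s",
    "RAY_SERVE_ROUTER_RETRY_BACKOFF_MULTIPLIER": "request_router_config.backoff_multiplier",
    "RAY_SERVE_ROUTER_RETRY_MAX_BACKOFF_S": "request_router_config.max_backoff_s",
}


def warn_on_deprecated_env_vars(environ=None):
    """Emit a DeprecationWarning for each deprecated env var that is set.

    Returns the names that triggered a warning, which makes the check
    easy to test.
    """
    environ = os.environ if environ is None else environ
    flagged = []
    for env_var, replacement in DEPRECATED_ENV_VARS.items():
        if env_var in environ:
            warnings.warn(
                f"{env_var} is deprecated; set {replacement} instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            flagged.append(env_var)
    return flagged


# Passing a plain dict keeps the example independent of the real environment.
flagged = warn_on_deprecated_env_vars({"RAY_SERVE_ROUTER_RETRY_MAX_BACKOFF_S": "1"})
print(flagged)
```

Note that Python hides DeprecationWarning by default outside of `__main__` and test runners, so users may need `-W default` or a warnings filter to see the message.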

Signed-off-by: harshit <harshit@anyscale.com>
harshit-anyscale requested a review from a team as a code owner February 6, 2026 11:29
harshit-anyscale self-assigned this Feb 6, 2026
gemini-code-assist bot left a comment

Code Review

This pull request successfully moves the router retry backoff configuration from environment variables to RequestRouterConfig, deprecating the old env vars. The changes are well-tested and correctly implemented across the codebase.

One point of feedback: the documentation in doc/source/serve/advanced-guides/performance.md for the deprecated RAY_SERVE_ROUTER_RETRY_* environment variables has not been updated. It should be updated to reflect that these variables are deprecated and guide users to use the new RequestRouterConfig options.

I've also suggested some minor improvements to the descriptions of the new config options in python/ray/serve/config.py to make the deprecation clearer to users.

Comment on lines 256 to 260
description=(
"Initial backoff time (in seconds) before retrying to route a request "
"to a replica. Defaults to RAY_SERVE_ROUTER_RETRY_INITIAL_BACKOFF_S "
"environment variable, or 0.025 if not set."
),
medium

The description is a bit confusing. To make it clearer that the environment variable is being deprecated, consider rephrasing it. For example:

"Initial backoff time (in seconds) before retrying to route a request to a replica. Defaults to 0.025. This can be overridden by the deprecated `RAY_SERVE_ROUTER_RETRY_INITIAL_BACKOFF_S` environment variable."

This applies to backoff_multiplier and max_backoff_s as well.

        description=(
            "Initial backoff time (in seconds) before retrying to route a request "
            "to a replica. Defaults to 0.025. This can be overridden by the deprecated `RAY_SERVE_ROUTER_RETRY_INITIAL_BACKOFF_S` environment variable."
        ),

Comment on lines 265 to 269
description=(
"Multiplier applied to the backoff time after each retry. "
"Defaults to RAY_SERVE_ROUTER_RETRY_BACKOFF_MULTIPLIER "
"environment variable, or 2 if not set."
),
medium

To make it clearer that the environment variable is being deprecated, consider rephrasing the description. For example:

"Multiplier applied to the backoff time after each retry. Defaults to 2. This can be overridden by the deprecated `RAY_SERVE_ROUTER_RETRY_BACKOFF_MULTIPLIER` environment variable."
        description=(
            "Multiplier applied to the backoff time after each retry. Defaults to 2. "
            "This can be overridden by the deprecated `RAY_SERVE_ROUTER_RETRY_BACKOFF_MULTIPLIER` environment variable."
        ),

Comment on lines 274 to 278
description=(
"Maximum backoff time (in seconds) between retries. "
"Defaults to RAY_SERVE_ROUTER_RETRY_MAX_BACKOFF_S "
"environment variable, or 0.5 if not set."
),
medium

To make it clearer that the environment variable is being deprecated, consider rephrasing the description. For example:

"Maximum backoff time (in seconds) between retries. Defaults to 0.5. This can be overridden by the deprecated `RAY_SERVE_ROUTER_RETRY_MAX_BACKOFF_S` environment variable."
        description=(
            "Maximum backoff time (in seconds) between retries. Defaults to 0.5. "
            "This can be overridden by the deprecated `RAY_SERVE_ROUTER_RETRY_MAX_BACKOFF_S` environment variable."
        ),

cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 3 potential issues.

double backoff_multiplier = 7;

// Maximum backoff time (in seconds) between retries.
double max_backoff_s = 8;
PR lacks description explaining changes

Low Severity

⚠️ This PR needs a clearer title and/or description.

To help reviewers, please ensure your PR includes:

  • Title: A concise summary of the change
  • Description:
    • What problem does this solve?
    • How does this PR solve it?
    • Any relevant context for reviewers such as:
      • Why is the problem important to solve?
      • Why was this approach chosen over others?

See this list of PRs as examples for PRs that have gone above and beyond:

Signed-off-by: harshit <harshit@anyscale.com>